Matrix Games, Mixed Strategies, and Statistical Mechanics

Matrix games constitute a fundamental problem of game theory and describe a situation of two players with completely conflicting interests. We show how methods from statistical mechanics can be used to investigate the statistical properties of optimal mixed strategies of large matrix games with random payoff matrices and derive analytical expressions for the value of the game and the distribution of strategy strengths. In particular, the fraction of pure strategies not contributing to the optimal mixed strategy of a player is calculated. Both independently distributed and correlated elements of the payoff matrix are considered, and the results are compared with numerical simulations.

Game theory models in mathematical terms problems of strategic decision-making typically arising in economics, sociology, or international relations and owes much of its modern form to J. von Neumann [?]. The generic situation in game theory consists of a set of players $\{X, Y, \dots\}$ choosing between different strategies $\{X_i\}, \{Y_j\}, \dots$, the combination of which determines the outcome of a game specified by the payoffs $P_X(X_i, Y_j, \dots), P_Y(X_i, Y_j, \dots), \dots$ each player is going to receive. The payoffs depend on the strategies of all players, and the problem for every individual player is to choose his strategy so as to optimize his payoff without having control over the strategies of the other players. Despite the extreme simplification of the real-world situation inherent in this framework, game theory has proven not only to be a viable mathematical discipline but also to be able to characterize important features of economic systems. Many interesting results have been obtained since von Neumann's pioneering work, including the characterization of equilibria [?], [?] and the emergence of cooperation [?]. However, detailed investigations have been restricted either to general statements concerning, e.g., the existence of equilibria, or to situations where every player has only a small number of strategies at his disposal and where the payoffs are simple functions of these strategies. As many situations of interest show a large number of possible strategies and rather complicated relationships between strategic choices and the resulting payoffs, it is tempting to model the payoffs by a random function and to apply the methods of statistical mechanics to describe the properties of the game. This will be a sensible approach if there are characteristic "macroscopic" quantities which do not depend on the particular realization of the random parameters, i.e. are self-averaging in the sense of the statistical mechanics of disordered systems [?]; for related applications see [?].

In the present letter we show how methods from statistical mechanics can be applied to characterize the statistical properties of optimal strategies in matrix games with large randomly chosen payoff matrices. Explicitly we calculate the mean payoff and the fraction of pure strategies which occur in the optimal mixed strategy of a player. For simplicity we restrict ourselves to matrix
games, the type of zero-sum games between two players which also forms the basis of von Neumann's treatment [?]. Such games are defined by a (not necessarily square) payoff matrix $c_{ij}$: player X may choose between $N$ strategies $X_i$ and player Y between $M$ strategies $Y_j$, where $i = 1, \dots, N$ and $j = 1, \dots, M$. At each step of this game they receive the payoffs $P_X(X_i, Y_j) = c_{ij}$ and $P_Y(X_i, Y_j) = -c_{ij}$, respectively. As player X wishes to gain as large a payoff as possible, whereas player Y must attempt to reach as small a value of $c_{ij}$ as possible in order to maximize his payoff $P_Y(X_i, Y_j)$, the goals of the players are completely conflicting. Thus it is appropriate for the players to proceed as follows: player X knows that when playing strategy $X_i$ he will receive at least the payoff $\min_j c_{ij}$. He therefore chooses strategy $X_{i^*}$ satisfying $\min_j c_{i^* j} = \max_i \min_j c_{ij}$. Equivalently, player Y plays strategy $Y_{j^*}$ determined by $\max_i c_{i j^*} = \min_j \max_i c_{ij}$, since it minimizes his losses for the optimal choices of X. It is easy to show that $\max_i \min_j c_{ij} \le \min_j \max_i c_{ij}$ always. The situation is simple if the matrix has a so-called saddle point, i.e. if there is a pair $i^*, j^*$ satisfying $\max_i \min_j c_{ij} = c_{i^* j^*} = \min_j \max_i c_{ij}$. In this case it is optimal for both players to stick to their pure strategies $X_{i^*}$ and $Y_{j^*}$ respectively, since deviations from an optimal strategy by one of the players will lead to a lower payoff for this player. For a large random matrix $c$ the probability for the existence of such a saddle point vanishes exponentially with the size of the matrix and the choice of an optimal strategy is less obvious. Since in this case $\max_i \min_j c_{ij} < \min_j \max_i c_{ij}$, player X will attempt to achieve a greater gain than his guaranteed minimal gain $\max_i \min_j c_{ij}$ and likewise Y will attempt to achieve a smaller loss than $\min_j \max_i c_{ij}$. To this end they have to prevent their opponent from guessing which strategy they are going to play and choose each strategy with a certain probability $x_i$ and $y_j$ respectively [?]. A vector $x_i$ of probabilities is called a mixed strategy and by the normalization condition is constrained to lie on the $N$-dimensional simplex. The famous minimax theorem by von Neumann states that for any payoff matrix $c$ there exists a saddle point of mixed strategies, i.e. there are two vectors $x^*$ and $y^*$ such that

$$ \max_{\{x_i\}} \min_{\{y_j\}} \sum_{ij} x_i c_{ij} y_j = \sum_{ij} x_i^* c_{ij} y_j^* = \min_{\{y_j\}} \max_{\{x_i\}} \sum_{ij} x_i c_{ij} y_j . $$

The expected payoff for the optimal mixed strategies $\nu_c := \sum_{ij} x_i^* c_{ij} y_j^*$ is called the value of the game, and $x^*, y^*$ denote optimal mixed strategies of player X and Y, since again deviations from an optimal strategy by one of the players will lead to a lower payoff for this player.
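
The pure-strategy reasoning above can be illustrated with a minimal sketch. The function names `pure_security_levels` and `pure_saddle_points` and the two small example matrices are hypothetical choices made for this illustration only; they are not taken from the original analysis.

```python
def pure_security_levels(c):
    """Return (max_i min_j c_ij, min_j max_i c_ij) for a payoff matrix c."""
    n_rows, n_cols = len(c), len(c[0])
    # Guaranteed minimal gain of player X when restricted to pure strategies.
    maximin = max(min(row) for row in c)
    # Guaranteed maximal loss of player Y when restricted to pure strategies.
    minimax = min(max(c[i][j] for i in range(n_rows)) for j in range(n_cols))
    return maximin, minimax


def pure_saddle_points(c):
    """List all (i, j) where c_ij is minimal in its row and maximal in its column."""
    n_rows, n_cols = len(c), len(c[0])
    return [(i, j)
            for i in range(n_rows) for j in range(n_cols)
            if c[i][j] == min(c[i])
            and c[i][j] == max(c[k][j] for k in range(n_rows))]


# Illustrative matrices (assumptions of this sketch).
with_saddle = [[3, 1, 4],
               [5, 2, 6],
               [0, -1, 7]]
matching_pennies = [[1, -1],
                    [-1, 1]]

print(pure_security_levels(with_saddle), pure_saddle_points(with_saddle))
# prints (2, 2) [(1, 1)]
print(pure_security_levels(matching_pennies), pure_saddle_points(matching_pennies))
# prints (-1, 1) []
```

In the first matrix the two security levels coincide and both players can stick to pure strategies; in the second they differ, which is the situation in which mixed strategies become necessary.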

In the following we show how the statistical properties of such optimal mixed strategies for random payoff matrices may be characterized analytically in the limit $N \to \infty$, $M \to \infty$ with $M/N = \alpha = O(1)$. As is generally the case in fully connected disordered systems, only the first two cumulants of the probability distribution $P(\{c_{ij}\})$ are relevant. Since an average value $\langle\langle c \rangle\rangle$ of the elements of the payoff matrix only results in a modified value of the game $\nu_c + \langle\langle c \rangle\rangle$ without changing the optimal mixed strategies, we may set $\langle\langle c \rangle\rangle = 0$ without loss of generality and take the elements $c_{ij}$ to be independent Gaussian distributed variables with zero mean and variance $N^{-1}$.
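
The statement that a non-zero mean only shifts the value of the game can be checked in one line. In the sketch below, $\bar c$ is shorthand introduced here for a constant added to every matrix element, and the probabilities are assumed to be normalized to one, $\sum_i x_i = \sum_j y_j = 1$:

```latex
\sum_{i,j} x_i \,(c_{ij} + \bar c)\, y_j
  = \sum_{i,j} x_i c_{ij} y_j + \bar c \Big(\sum_i x_i\Big)\Big(\sum_j y_j\Big)
  = \sum_{i,j} x_i c_{ij} y_j + \bar c .
```

Since the shift is the same for every pair of mixed strategies, the maximizing $x^*$ and minimizing $y^*$ are unchanged and only the value moves by $\bar c$.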

We then note [?] that a necessary and sufficient condition for the mixed strategy $\{x_i\}$ of player X to be optimal is

$$ \sum_i x_i c_{ij} \ge \nu_c \quad \forall j . \tag{1} $$

The condition is necessary since, if it is violated for some $j$, player Y playing $Y_j$ will lead to a payoff lower than $\nu_c$. It is also sufficient since combining (1) with the minimax theorem gives $\sum_{ij} x_i c_{ij} y_j^* = \nu_c$. We may thus characterize mixed strategies of player X by introducing the partition function



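$$ Z(\nu) = \frac{\int_0^\infty \prod_{i=1}^{N} dx_i \; \delta\big(\sum_i x_i - N\big) \prod_{j=1}^{M} \Theta\big(\sum_i x_i c_{ij} - \nu\big)}{\int_0^\infty \prod_{i=1}^{N} dx_i \; \delta\big(\sum_i x_i - N\big)} \tag{2} $$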



where $\Theta(x)$ is the Heaviside step function and the probabilities of playing a given strategy and the payoff have been rescaled so that $\sum_{i=1}^{N} x_i = N$ for convenience. Thus $Z(\nu)$ equals the fraction of the simplex obeying $\sum_i x_i c_{ij} \ge \nu$ for all $j$ and therefore lies in the interval $[0,1]$. Since $Z(\nu)$ scales exponentially with $N$, the quantity central to our calculation is the entropy $S(\nu) := \frac{1}{N} \ln Z(\nu)$, which in general will be negative, as usual for classical systems with continuous degrees of freedom.
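
For very small matrices the interpretation of $Z(\nu)$ as a fraction of the simplex can be checked directly by sampling. The following is a minimal Monte Carlo sketch under stated assumptions: the function names, the matrix size $N = M = 6$, the sample count, and the seeds are hypothetical choices for illustration. Because $Z(\nu)$ is exponentially small in $N$, this brute-force estimate is only feasible for small $N$ and is not the method used in the analysis.

```python
import random


def sample_rescaled_simplex(n, rng):
    """Uniform point on {x_i >= 0, sum_i x_i = n} via normalised exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [n * v / total for v in e]


def estimate_Z(c, nu_values, n_samples=20000, seed=0):
    """Estimate the fraction of the simplex with sum_i x_i c_ij >= nu for all j."""
    rng = random.Random(seed)
    n, m = len(c), len(c[0])
    counts = [0] * len(nu_values)
    for _ in range(n_samples):
        x = sample_rescaled_simplex(n, rng)
        # Payoff against the most damaging pure reply of player Y.
        worst = min(sum(x[i] * c[i][j] for i in range(n)) for j in range(m))
        for k, nu in enumerate(nu_values):
            if worst >= nu:
                counts[k] += 1
    return [cnt / n_samples for cnt in counts]


# Small Gaussian payoff matrix with zero mean and variance 1/N.
matrix_rng = random.Random(1)
N = M = 6
c = [[matrix_rng.gauss(0.0, N ** -0.5) for _ in range(M)] for _ in range(N)]

nus = [-3.0, -1.0, -0.5, 0.0]
Z = estimate_Z(c, nus)
for nu, z in zip(nus, Z):
    print(f"nu = {nu:5.2f}   Z ~ {z:.4f}")
```

The same samples are reused for every value of $\nu$, so the estimate is non-increasing in $\nu$ by construction, mirroring the monotonic decrease of the entropy.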

Assuming the entropy $S(\nu)$ to be self-averaging, we use the replica trick $\ln Z = \lim_{n \to 0} \frac{\partial}{\partial n} Z^n$ and compute the average over the payoffs of the replicated partition function for integer $n$ ($a, b = 1, \dots, n$). The calculation proceeds by using the integral representation of the Heaviside step function and by introducing the symmetric matrix of overlap order parameters $q_{ab} = \frac{1}{N} \sum_i x_i^a x_i^b$ via integrals over $q_{ab}$ and delta functions represented by integrals over the conjugate order parameters $\hat q_{ab}$ [?]. The integrals over $E_a$ arise from the integral representation of the constraint $\sum_i x_i^a = N$, giving

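$$
\begin{aligned}
\langle\langle Z^n(\nu) \rangle\rangle \propto {} & \int \prod_{a \ge b} \frac{dq_{ab}\, d\hat q_{ab}}{2\pi/N} \int \prod_a \frac{dE_a}{2\pi/N} \exp\Big( -iN \sum_{a \ge b} q_{ab} \hat q_{ab} - iN \sum_a E_a \Big) \\
& \times \int_0^\infty \prod_{i,a} dx_i^a \exp\Big( i \sum_{a \ge b,\, i} \hat q_{ab}\, x_i^a x_i^b + i \sum_{a,i} E_a x_i^a \Big) \\
& \times \int_\nu^\infty \prod_{a,j} d\lambda_j^a \int \prod_{a,j} \frac{dy_j^a}{2\pi} \exp\Big( -\frac{1}{2} \sum_{a,b,j} q_{ab}\, y_j^a y_j^b + i \sum_{a,j} y_j^a \lambda_j^a \Big)
\end{aligned} \tag{3}
$$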

In the limit of large payoff matrices $N \to \infty$ the integrals over order parameters are dominated by their saddle point. Throughout this paper we use the replica-symmetric ansatz [?]

$$
\begin{aligned}
q_{aa} &= q_1, & i\hat q_{aa} &= -\tfrac{1}{2} \hat q_1, & iE_a &= E \quad \forall a, \\
q_{ab} &= q_0, & i\hat q_{ab} &= \hat q_0 & & \forall a > b .
\end{aligned} \tag{4}
$$

The limit $n \to 0$ of (3) may now be taken by analytic continuation giving an entropy

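$$
\begin{aligned}
S(\nu) = \mathop{\mathrm{extr}}_{q_1, q_0, E, \hat q_1, \hat q_0} \Big[ & \tfrac{1}{2} q_1 \hat q_1 + \tfrac{1}{2} q_0 \hat q_0 - E + \tfrac{1}{2} \ln(2\pi) - 1 - \tfrac{1}{2} \ln(\hat q_1 + \hat q_0) \\
& + \frac{E^2 + \hat q_0}{2(\hat q_1 + \hat q_0)} + \int Ds\, \ln H\Big( -\frac{E + \sqrt{\hat q_0}\, s}{\sqrt{\hat q_1 + \hat q_0}} \Big) + \alpha \int Ds\, \ln H\Big( \frac{\nu - \sqrt{q_0}\, s}{\sqrt{q_1 - q_0}} \Big) \Big]
\end{aligned} \tag{5}
$$

where $Ds = \frac{ds}{\sqrt{2\pi}} \exp(-s^2/2)$ and $H(x) = \int_x^\infty Ds$. Numerical evaluation of (5) shows that $S(\nu)$ is a continuously decreasing function of $\nu$. At $\nu_c$ it tends to $-\infty$,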
indicating that for larger values of $\nu$ there are no more solutions to (1). Furthermore, as $\nu \to \nu_c$ one finds $q_0 \to q_1$, indicating that as the points contributing to (2) crowd into an ever-decreasing area of the simplex, which shrinks to a point at $\nu_c$, their mutual overlap $q_0$ approaches the self-overlap $q_1$.

In this regime the entropy may be conveniently written in terms of the order parameters $q_0$, $\hat q_0$, $E$, $w = \hat q_1 + \hat q_0$ and $v = q_1 - q_0$. For $\nu < \nu_c$, $S(\nu)$ describes sub-optimal strategies. As $v \to 0$ we find $\hat q_0 \sim v^{-2}$, $w \sim v^{-1}$. Rescaling the conjugate order parameters accordingly and expanding the saddle-point equations to leading order in $v$ as $v \to 0$ we find



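$$
\begin{aligned}
w - \alpha H(-\nu_c/\sqrt{q_0}) &= 0 \\
w - H(-E/\sqrt{\hat q_0}) &= 0 \\
\hat q_0 - (\nu_c^2 + q_0)\, w - \alpha \sqrt{q_0}\, \nu_c\, G(-\nu_c/\sqrt{q_0}) &= 0 \\
q_0 - (E^2 + \hat q_0)/w - \sqrt{\hat q_0}\, E/w^2 \; G(-E/\sqrt{\hat q_0}) &= 0
\end{aligned} \tag{6}
$$

with $E = q_0 w - \hat q_0$ and $G(x) = \exp(-x^2/2)/\sqrt{2\pi}$.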
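
The saddle-point equations (6) can be solved numerically with a few lines of code. The sketch below assumes that (6) reads $w = \alpha H(-\nu_c/\sqrt{q_0})$, $w = H(-E/\sqrt{\hat q_0})$, $\hat q_0 = (\nu_c^2 + q_0) w + \alpha \sqrt{q_0}\, \nu_c\, G(\nu_c/\sqrt{q_0})$, $q_0 = (E^2 + \hat q_0)/w + \sqrt{\hat q_0}\, E\, G(E/\sqrt{\hat q_0})/w^2$ and $E = q_0 w - \hat q_0$, with $G$ the Gaussian density. The reduction to the two ratios `a` $= \nu_c/\sqrt{q_0}$ and `b` $= E/\sqrt{\hat q_0}$, the helper `g`, and the bisection bracket are rearrangements introduced for this sketch, not steps described in the text.

```python
import math
from statistics import NormalDist

_nd = NormalDist()


def G(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def H(x):
    """Gaussian tail H(x) = int_x^inf Ds, evaluated via erfc for accuracy."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def g(u):
    # Dividing the 3rd and 4th equations by q0 and q0_hat gives
    # q0_hat / q0 = w * g(a) and q0 / q0_hat = g(b) / w.
    return u * u + 1.0 + u * G(u) / H(-u)


def solve_optimal(alpha):
    """Return (nu_c, w) for aspect ratio alpha = M/N."""
    def b_of(a):
        # First two equations: H(-b) = alpha * H(-a).
        p = min(alpha * H(-a), 1.0 - 1e-12)
        return _nd.inv_cdf(p)

    def f(a):
        # Product of the 3rd and 4th equations: g(a) * g(b) = 1.
        return g(a) * g(b_of(a)) - 1.0

    lo = -6.0
    hi = 6.0 if alpha <= 1.0 else _nd.inv_cdf(1.0 / alpha) - 1e-9
    for _ in range(200):                     # plain bisection on a
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    a = 0.5 * (lo + hi)
    b = b_of(a)
    w = H(-b)
    # E = q0 * w - q0_hat combined with the 4th equation fixes the scale.
    sqrt_q0_hat = w / (G(b) + b * w)
    q0 = sqrt_q0_hat ** 2 / (w * g(a))
    return a * math.sqrt(q0), w


for alpha in (0.5, 1.0, 2.0):
    nu_c, w = solve_optimal(alpha)
    print(f"alpha = {alpha:3.1f}   nu_c = {nu_c:+.4f}   theta(0) = w = {w:.4f}")
```

At $\alpha = 1$ the sketch returns $\nu_c = 0$ and $w = 1/2$ to numerical precision, which provides a quick consistency check of the equations as assumed here.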

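The statistical properties of optimal strategies $\{x_i^{\rm opt}\}$ may be deduced from the proportion of strategies $X_i$ with $x_i > a$,

$$ \theta(a) := \Big\langle\Big\langle \frac{1}{N} \sum_i \Theta(x_i^{\rm opt} - a) \Big\rangle\Big\rangle = H\Big( \frac{w a - E}{\sqrt{\hat q_0}} \Big) . \tag{7} $$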

Thus only a fraction $\theta(0) = w$ of the pure strategies $X_i$ have $x_i > 0$ and are played with non-zero probability. This striking effect may be explained by considering the behaviour of player Y, whose optimal mixed strategy $y^*$ obeys $\lambda_i^* = \sum_j c_{ij} y_j^* \le \nu_c \ \forall i$. Since $\nu_c = \frac{1}{N} \sum_i \lambda_i^* x_i^*$, $x_i^*$ must be zero if $\lambda_i^* < \nu_c$. This mechanism thus ensures an expected payoff $\nu_c$ to X, even if Y chooses an optimal strategy. However, it is not to be confused with the concept of domination, widely discussed in the game theory literature [?], where a strategy $X_k$ has $x_k = 0$ because, whatever the response of the opponent, some other pure or mixed strategy will lead to a higher expected payoff. In fact, in the thermodynamic limit domination of a pure strategy occurs with probability zero, since for a pure strategy $X_k$ to be dominated by a mixed strategy $x^{\rm d}$ requires $\frac{1}{N} \sum_i x_i^{\rm d} c_{ij} > c_{kj} \ \forall j$, but the lhs is $O(N^{-1})$ whereas the rhs is $O(N^{-1/2})$.

FIG. 1. The value of the game $\nu_c$ and (inset) the fraction of strategies played with non-zero probability $\theta(0)$ as a function of $\alpha$. The analytical results (full line) are compared to numerical simulations with $N = 200$ averaged over 200 samples. The symbol size corresponds to the statistical error.

Figure 1 shows the value of the game and (inset) the fraction of strategies played with non-zero probability as a function of the aspect ratio $\alpha$ of the payoff matrix. At $\alpha = 1$, $\nu_c = 0$ and $\theta(0) = 1/2$. The result $\nu_c = 0$ at $\alpha = 1$ is a consequence of the symmetry of the distribution of payoffs under $c_{ij} \to -c_{ji}$, i.e. under interchange of player X and player Y [?]. For $\alpha > 1$ player Y has a greater choice of strategies than player X and vice versa. As expected, the payoff to player X decreases as the range of strategy choices of player Y increases. The fraction of strategies played with non-zero probability increases with $\alpha$, which reflects the decrease of $\nu_c$ with $\alpha$: at lower $\nu_c$ there are fewer $i$ with $\lambda_i^* = \sum_j c_{ij} y_j^* < \nu_c$, so as argued above the number of strategies $X_i$ played with non-zero probability increases as a result.
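
The argument used here, that $x_i^*$ vanishes whenever $\lambda_i^* < \nu_c$, can be spelled out explicitly. The following short derivation only assumes the rescaled normalization $\frac{1}{N} \sum_i x_i^* = 1$ together with $\lambda_i^* \le \nu_c$ and $\nu_c = \frac{1}{N} \sum_i \lambda_i^* x_i^*$ as stated above:

```latex
0 = \nu_c - \frac{1}{N} \sum_i \lambda_i^* x_i^*
  = \frac{1}{N} \sum_i \big( \nu_c - \lambda_i^* \big)\, x_i^* ,
\qquad
\nu_c - \lambda_i^* \ge 0, \quad x_i^* \ge 0
\;\Longrightarrow\;
\big( \nu_c - \lambda_i^* \big)\, x_i^* = 0 \quad \forall i .
```

Every term of the sum is non-negative, so each must vanish separately: a strategy $X_i$ can carry weight only if it attains $\lambda_i^* = \nu_c$ against the optimal strategy of player Y.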

We next abandon the initial assumption that the individual entries $c_{ij}$ in the payoff matrix are independently distributed and consider the case where the outcomes of the game for different strategy choices of the players are correlated with each other. Such correlations may arise quite naturally in real applications since we expect some strategies to have broadly similar properties and hence yield similar results for a given response of the respective opponent. For simplicity we restrict the discussion to the case $\alpha = 1$. The most general tractable case appears to be

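$$ \frac{\langle\langle c_{ij} c_{kl} \rangle\rangle}{\sqrt{\langle\langle c_{ij}^2 \rangle\rangle \langle\langle c_{kl}^2 \rangle\rangle}} =: C_{(ij)(kl)} = C^c_{ik}\, C^r_{jl} \tag{8} $$

where $C^c_{ik}$ and $C^r_{jl}$ refer to column- and row-like correlations. Of course $P(\{c_{ij}\})$ is not uniquely determined by its second moments, but as argued above it suffices to consider Gaussian distributed payoff matrices, so

$$ P(\{c_{ij}\}) \propto \exp\Big( -\frac{N}{2} \sum_{i,j,k,l} c_{ij}\, (C^{-1})_{(ij)(kl)}\, c_{kl} \Big) . \tag{9} $$

The specific form of $C^c_{ik}$ and $C^r_{jl}$ we will consider in the following is

$$ C^{c,r}_{ik} = \begin{cases} 1 & i = k \\ c_{c,r}/N & i \ne k \end{cases} \tag{10} $$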
and the resulting replica-symmetric entropy averaged over the distribution (9) may be calculated as outlined for the case of uncorrelated payoffs above. Again in the limit $v \to 0$ the corresponding saddle-point equations describe optimal strategies. For $c_c = c_r = c$ the optimal payoff is zero as a result of the symmetry of (9) under the exchange of players. Figure 2 shows the fraction $\theta(0)$ of strategies played with non-zero probability in optimal strategies as a function of $c$. $\theta(0)$ decreases with increasing $c$: at positive $c$ there are strategies which tend to be beneficial for player X whatever the response of the opponent. As a result X concentrates on a smaller fraction of his strategies, and vice versa for negative $c$.
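
Correlated payoff matrices of this type can be generated numerically. The sketch below assumes the covariance $\langle\langle c_{ij} c_{kl} \rangle\rangle = C^c_{ik} C^r_{jl} / N$ with unit diagonal and off-diagonal elements $c_{c,r}/N$ in each factor, and it uses a Cholesky factorization of the two factors. That construction, and the function names, are choices made for this illustration; the text does not say how the correlated matrices were produced.

```python
import numpy as np


def correlation_matrix(n, c):
    """C_ik = 1 for i = k and c/n otherwise (positive definite for -n/(n-1) < c < n)."""
    return (1.0 - c / n) * np.eye(n) + (c / n) * np.ones((n, n))


def sample_correlated_payoffs(n, c_col, c_row, rng):
    """Draw an n x n Gaussian matrix with <c_ij c_kl> = C^c_ik C^r_jl / n."""
    l_col = np.linalg.cholesky(correlation_matrix(n, c_col))  # acts on index i
    l_row = np.linalg.cholesky(correlation_matrix(n, c_row))  # acts on index j
    z = rng.standard_normal((n, n))                           # independent entries
    return l_col @ z @ l_row.T / np.sqrt(n)


rng = np.random.default_rng(0)
n, c_row = 30, 2.0
# Row sums collect the row-like correlations: their variance is 1 + (n - 1) c_row / n.
row_sums = np.concatenate(
    [sample_correlated_payoffs(n, 0.0, c_row, rng).sum(axis=1) for _ in range(300)]
)
print("measured variance of row sums:", row_sums.var())
print("expected variance of row sums:", 1.0 + (n - 1) * c_row / n)
```

Because the individual correlations are only of order $1/N$, they are hardly visible in single matrix elements; collective quantities such as row sums make them measurable.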

In the asymmetric case $c_r = c$, $c_c = 0$, however, a non-zero value of the game is possible. The resulting value for $\nu_c$ and the fraction of strategies played with non-zero probability are shown in figure 3. Again the correlation between payoffs in the same row of the payoff matrix leads to strategies which tend to be either beneficial or detrimental to player X.

FIG. 2. The fraction $\theta(0)$ of strategies played with non-zero probability as a function of $c = c_r = c_c$.

By admitting only the beneficial ones into his mixed strategies, X may achieve a positive payoff. The fraction of strategies played with non-zero probability decreases accordingly. For negative $c$ the entries in the same rows of the payoff matrix are anticorrelated, so different responses of player Y to the same strategy of player X tend to lead to different payoffs. This situation leads to a negative value of the game.

FIG. 3. The optimal payoff $\nu_c$ (full) and the fraction $\theta(0)$ of strategies played with non-zero probability (dotted) against $c = c_r$ at $c_c = 0$.

The simulation results shown in figures 1-3 were obtained using the simplex algorithm to solve the linear programming problem [?] defined by (1) for a system of size $N = 200$ averaged over 200 payoff matrices with Gaussian distributed inputs [?]. The numerical results show very good agreement with the analytical expressions.
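
A simulation of this kind can be sketched as follows. The code is an illustration under stated assumptions, not the authors' program: it uses SciPy's `linprog` with the HiGHS dual simplex (`method="highs-ds"`) as a stand-in for the unspecified simplex implementation, a smaller system size than $N = 200$ to keep the run short, and a hypothetical threshold `tol` to decide which strategies count as played.

```python
import numpy as np
from scipy.optimize import linprog


def solve_game(c):
    """Optimal mixed strategy of X for an N x M payoff matrix, rescaled to sum_i x_i = N.

    Maximises nu subject to sum_i x_i c_ij >= nu for all j and x_i >= 0.
    """
    n, m = c.shape
    cost = np.zeros(n + 1)
    cost[-1] = -1.0                               # linprog minimises, so use -nu
    a_ub = np.hstack([-c.T, np.ones((m, 1))])     # nu - sum_i x_i c_ij <= 0
    b_ub = np.zeros(m)
    a_eq = np.ones((1, n + 1))
    a_eq[0, -1] = 0.0                             # normalisation involves x only
    b_eq = [float(n)]
    bounds = [(0, None)] * n + [(None, None)]     # nu is a free variable
    res = linprog(cost, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
                  bounds=bounds, method="highs-ds")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[-1]


def simulate(n, alpha, samples, seed=0, tol=1e-7):
    """Average value and fraction of played strategies over random payoff matrices."""
    rng = np.random.default_rng(seed)
    m = int(round(alpha * n))
    values, fractions = [], []
    for _ in range(samples):
        c = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, m))  # variance 1/N
        x, nu = solve_game(c)
        values.append(nu)
        fractions.append(np.mean(x > tol))
    return float(np.mean(values)), float(np.mean(fractions))


mean_nu, mean_fraction = simulate(n=40, alpha=1.0, samples=20, seed=1)
print("mean value of the game:", mean_nu)
print("mean fraction of strategies played:", mean_fraction)
```

For $\alpha = 1$ the averages should lie close to $\nu_c = 0$ and $\theta(0) = 1/2$, up to finite-size and sampling fluctuations.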

In conclusion we have shown that techniques from the statistical mechanics of disordered systems may be used to analyze the statistical properties of optimal solutions of matrix games with random payoffs. Self-averaging "macroscopic" quantities such as the value of the game were identified and calculated for various probability distributions. These quantities include the fraction of strategies played with non-zero probability. Further problems in matrix games which may be treated using these methods include the effects of deviating from the optimal strategy and the influence of perturbations of the payoff matrix on the optimal strategy, which forms the basis of the justification for the full stability of mixed equilibria [?]. Furthermore, work is in progress on a field of interest to current mathematical game theory, the statistical description of Nash equilibria in bimatrix games.